GNU Info File | 1998-05-21 | 49.9 KB | 1,228 lines
This is Info file ../../info/internals.info, produced by Makeinfo
version 1.68 from the input file internals.texi.
Copyright (C) 1992 - 1996 Ben Wing. Copyright (C) 1996, 1997 Sun
Microsystems. Copyright (C) 1994, 1995 Free Software Foundation.
Copyright (C) 1994, 1995 Board of Trustees, University of Illinois.
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided that the
entire resulting derived work is distributed under the terms of a
permission notice identical to this one.
Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that this permission notice may be stated in a
translation approved by the Foundation.
Permission is granted to copy and distribute modified versions of
this manual under the conditions for verbatim copying, provided also
that the section entitled "GNU General Public License" is included
exactly as in the original, and provided that the entire resulting
derived work is distributed under the terms of a permission notice
identical to this one.
Permission is granted to copy and distribute translations of this
manual into another language, under the above conditions for modified
versions, except that the section entitled "GNU General Public License"
may be included in a translation approved by the Free Software
Foundation instead of in the original English.
File: internals.info, Node: Markers and Extents, Next: Bufbytes and Emchars, Prev: Buffer Lists, Up: Buffers and Textual Representation
Markers and Extents
===================
Among the things associated with a buffer are things that are
logically attached to certain buffer positions. This can be used to
keep track of a buffer position when text is inserted and deleted, so
that it remains at the same spot relative to the text around it; to
assign properties to particular sections of text; etc. There are two
such objects that are useful in this regard: they are "markers" and
"extents".
A "marker" is simply a flag placed at a particular buffer position,
which is moved around as text is inserted and deleted. Markers are
used for all sorts of purposes, such as the `mark' that is the other
end of textual regions to be cut, copied, etc.
An "extent" is similar to two markers plus some associated
properties, and is used to keep track of regions in a buffer as text is
inserted and deleted, and to add properties (e.g. fonts) to particular
regions of text. The external interface of extents is explained
elsewhere.
The important thing here is that markers and extents simply contain
buffer positions in them as integers, and every time text is inserted or
deleted, these positions must be updated. In order to minimize the
amount of shuffling that needs to be done, the positions in markers and
extents (there's one per marker, two per extent) are stored in Meminds.
This means that they only need to be moved when the text is physically
moved in memory; since the gap structure tries to minimize this, it also
minimizes the number of marker and extent indices that need to be
adjusted. Look in `insdel.c' for the details of how this works.
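The effect of storing positions as memory indices can be sketched as
follows.  This is only an illustration under the assumption of a simple
gap buffer; the structure and the names `gap_buffer', `bufpos_to_memind',
`memind_to_bufpos' and `insert_at_gap' are hypothetical and do not
reproduce the code in `insdel.c'.

```c
#include <assert.h>

/* Hypothetical gap-buffer description: text occupies memory indices
   [0, gap_start) and [gap_start + gap_size, ...). */
struct gap_buffer
{
  int gap_start;   /* memory index where the gap begins */
  int gap_size;    /* number of unused bytes in the gap */
};

/* Convert a 0-based buffer position to a memory index. */
int
bufpos_to_memind (const struct gap_buffer *b, int pos)
{
  assert (pos >= 0);
  return pos < b->gap_start ? pos : pos + b->gap_size;
}

/* Convert a memory index (which must not lie inside the gap) back to
   a buffer position. */
int
memind_to_bufpos (const struct gap_buffer *b, int mem)
{
  assert (mem < b->gap_start || mem >= b->gap_start + b->gap_size);
  return mem < b->gap_start ? mem : mem - b->gap_size;
}

/* Insert N bytes at the gap: the gap shrinks, no text moves in
   memory, and stored memory indices need no adjustment. */
void
insert_at_gap (struct gap_buffer *b, int n)
{
  assert (n >= 0 && n <= b->gap_size);
  b->gap_start += n;
  b->gap_size -= n;
}
```

In this sketch a marker that remembers a memory index after the gap
reports a larger buffer position once text has been inserted at the gap,
even though the stored index itself was never touched.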
One other important distinction is that markers are "temporary"
while extents are "permanent". This means that markers disappear as
soon as there are no more pointers to them, and correspondingly, there
is no way to determine what markers are in a buffer if you are just
given the buffer. Extents remain in a buffer until they are detached
(which could happen as a result of text being deleted) or the buffer is
deleted, and primitives do exist to enumerate the extents in a buffer.
File: internals.info, Node: Bufbytes and Emchars, Next: The Buffer Object, Prev: Markers and Extents, Up: Buffers and Textual Representation
Bufbytes and Emchars
====================
Not yet documented.
File: internals.info, Node: The Buffer Object, Prev: Bufbytes and Emchars, Up: Buffers and Textual Representation
The Buffer Object
=================
Buffers contain fields not directly accessible by the Lisp
programmer. We describe them here, naming them by the names used in
the C code. Many are accessible indirectly in Lisp programs via Lisp
primitives.
`name'
The buffer name is a string that names the buffer. It is
guaranteed to be unique. *Note Buffer Names: (lispref)Buffer
Names.
`save_modified'
This field contains the time when the buffer was last saved, as an
integer. *Note Buffer Modification: (lispref)Buffer Modification.
`modtime'
This field contains the modification time of the visited file. It
is set when the file is written or read. Every time the buffer is
written to the file, this field is compared to the modification
time of the file. *Note Buffer Modification: (lispref)Buffer
Modification.
`auto_save_modified'
This field contains the time when the buffer was last auto-saved.
`last_window_start'
This field contains the `window-start' position in the buffer as of
the last time the buffer was displayed in a window.
`undo_list'
This field points to the buffer's undo list. *Note Undo:
(lispref)Undo.
`syntax_table_v'
This field contains the syntax table for the buffer. *Note Syntax
Tables: (lispref)Syntax Tables.
`downcase_table'
This field contains the conversion table for converting text to
lower case. *Note Case Tables: (lispref)Case Tables.
`upcase_table'
This field contains the conversion table for converting text to
upper case. *Note Case Tables: (lispref)Case Tables.
`case_canon_table'
This field contains the conversion table for canonicalizing text
for case-folding search. *Note Case Tables: (lispref)Case Tables.
`case_eqv_table'
This field contains the equivalence table for case-folding search.
*Note Case Tables: (lispref)Case Tables.
`display_table'
This field contains the buffer's display table, or `nil' if it
doesn't have one. *Note Display Tables: (lispref)Display Tables.
`markers'
This field contains the chain of all markers that currently point
into the buffer. Deletion of text in the buffer, and motion of
the buffer's gap, must check each of these markers and perhaps
update it. *Note Markers: (lispref)Markers.
`backed_up'
This field is a flag that tells whether a backup file has been
made for the visited file of this buffer.
`mark'
This field contains the mark for the buffer. The mark is a marker,
hence it is also included on the list `markers'. *Note The Mark:
(lispref)The Mark.
`mark_active'
This field is non-`nil' if the buffer's mark is active.
`local_var_alist'
This field contains the association list describing the variables
local in this buffer, and their values, with the exception of
local variables that have special slots in the buffer object.
(Those slots are omitted from this table.) *Note Buffer-Local
Variables: (lispref)Buffer-Local Variables.
`modeline_format'
This field contains a Lisp object which controls how to display
the mode line for this buffer. *Note Modeline Format:
(lispref)Modeline Format.
`base_buffer'
This field holds the buffer's base buffer (if it is an indirect
buffer), or `nil'.
File: internals.info, Node: MULE Character Sets and Encodings, Next: The Lisp Reader and Compiler, Prev: Buffers and Textual Representation, Up: Top
MULE Character Sets and Encodings
*********************************
Recall that there are two primary ways that text is represented in
XEmacs. The "buffer" representation sees the text as a series of bytes
(Bufbytes), with a variable number of bytes used per character. The
"character" representation sees the text as a series of integers
(Emchars), one per character. The character representation is a cleaner
representation from a theoretical standpoint, and is thus used in many
cases when lots of manipulations on a string need to be done. However,
the buffer representation is the standard representation used in both
Lisp strings and buffers, and because of this, it is the "default"
representation that text comes in. The reason for using this
representation is that it's compact and is compatible with ASCII.
* Menu:
* Character Sets::
* Encodings::
* Internal Mule Encodings::
* CCL::
File: internals.info, Node: Character Sets, Next: Encodings, Up: MULE Character Sets and Encodings
Character Sets
==============
A character set (or "charset") is an ordered set of characters. A
particular character in a charset is indexed using one or more
"position codes", which are non-negative integers. The number of
position codes needed to identify a particular character in a charset is
called the "dimension" of the charset. In XEmacs/Mule, all charsets
have dimension 1 or 2, and the size of all charsets (except for a few
special cases) is either 94, 96, 94 by 94, or 96 by 96. The range of
position codes used to index characters from any of these types of
character sets is as follows:
Charset type            Position code 1         Position code 2
------------------------------------------------------------
94                      33 - 126                N/A
96                      32 - 127                N/A
94x94                   33 - 126                33 - 126
96x96                   32 - 127                32 - 127
Note that in the above cases position codes do not start at an
expected value such as 0 or 1. The reason for this will become clear
later.
For example, Latin-1 is a 96-character charset, and JISX0208 (the
Japanese national character set) is a 94x94-character charset.
[Note that, although the ranges above define the *valid* position
codes for a charset, some of the slots in a particular charset may in
fact be empty. This is the case for JISX0208, for example, where (e.g.)
all the slots whose first position code is in the range 118 - 127 are
empty.]
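The ranges in the table above can be expressed as a small validity
check.  This is a sketch only; the enumeration `charset_type' and the
function `valid_position_codes' are hypothetical names chosen for the
example, not identifiers from the XEmacs sources.

```c
#include <assert.h>

/* Hypothetical tags for the four regular charset types. */
enum charset_type { CS_94, CS_96, CS_94x94, CS_96x96 };

/* Return 1 if (PC1, PC2) is a valid position-code pair for TYPE,
   0 otherwise.  For dimension-1 charsets PC2 is ignored. */
int
valid_position_codes (enum charset_type type, int pc1, int pc2)
{
  int is_94, lo, hi, dimension;

  assert (type == CS_94 || type == CS_96
          || type == CS_94x94 || type == CS_96x96);
  is_94 = (type == CS_94 || type == CS_94x94);
  lo = is_94 ? 33 : 32;
  hi = is_94 ? 126 : 127;
  dimension = (type == CS_94x94 || type == CS_96x96) ? 2 : 1;

  if (pc1 < lo || pc1 > hi)
    return 0;
  if (dimension == 2 && (pc2 < lo || pc2 > hi))
    return 0;
  return 1;
}
```

Note that the check only covers the valid ranges; as described above, a
slot inside the range may still be empty in a particular charset.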
There are three charsets that do not follow the above rules. All of
them have one dimension, and have ranges of position codes as follows:
Charset name            Position code 1
------------------------------------
ASCII                   0 - 127
Control-1               0 - 31
Composite               0 - some large number
(The upper bound of the position code for composite characters has
not yet been determined, but it will probably be at least 16,383).
ASCII is the union of two subsidiary character sets: Printing-ASCII
(the printing ASCII character set, consisting of position codes 33 -
126, like for a standard 94-character charset) and Control-ASCII (the
non-printing characters that would appear in a binary file with codes 0
- 32 and 127).
Control-1 contains the non-printing characters that would appear in a
binary file with codes 128 - 159.
Composite contains characters that are generated by overstriking one
or more characters from other charsets.
Note that some characters in ASCII, and all characters in Control-1,
are "control" (non-printing) characters. These have no printed
representation but instead control some other function of the printing
(e.g. TAB, code 9, moves the current character position to the next tab
stop). All other characters in all charsets are "graphic" (printing)
characters.
When a binary file is read in, the bytes in the file are assigned to
character sets as follows:
Bytes           Character set           Range
--------------------------------------------------
0 - 127         ASCII                   0 - 127
128 - 159       Control-1               0 - 31
160 - 255       Latin-1                 32 - 127
This is a bit ad-hoc but gets the job done.
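The byte assignment in the table can be written out directly.  The
following is a minimal sketch; `binary_charset', `charset_char' and
`classify_binary_byte' are hypothetical names used only for this
example.

```c
#include <assert.h>

enum binary_charset { BIN_ASCII, BIN_CONTROL_1, BIN_LATIN_1 };

struct charset_char
{
  enum binary_charset charset;
  int position_code;
};

/* Map one byte of a binary file to a charset and position code,
   following the table above. */
struct charset_char
classify_binary_byte (int byte)
{
  struct charset_char c;

  assert (byte >= 0 && byte <= 255);
  if (byte <= 127)
    {
      c.charset = BIN_ASCII;
      c.position_code = byte;          /* 0 - 127 */
    }
  else if (byte <= 159)
    {
      c.charset = BIN_CONTROL_1;
      c.position_code = byte - 128;    /* 0 - 31 */
    }
  else
    {
      c.charset = BIN_LATIN_1;
      c.position_code = byte - 128;    /* 32 - 127 */
    }
  return c;
}
```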
File: internals.info, Node: Encodings, Next: Internal Mule Encodings, Prev: Character Sets, Up: MULE Character Sets and Encodings
Encodings
=========
An "encoding" is a way of numerically representing characters from
one or more character sets. If an encoding only encompasses one
character set, then the position codes for the characters in that
character set could be used directly. This is not possible, however, if
more than one character set is to be used in the encoding.
For example, the conversion detailed above between bytes in a binary
file and characters is effectively an encoding that encompasses the
three character sets ASCII, Control-1, and Latin-1 in a stream of 8-bit
bytes.
Thus, an encoding can be viewed as a way of encoding characters from
a specified group of character sets using a stream of bytes, each of
which contains a fixed number of bits (but not necessarily 8, as in the
common usage of "byte").
Here are descriptions of a couple of common encodings:
* Menu:
* Japanese EUC (Extended Unix Code)::
* JIS7::
File: internals.info, Node: Japanese EUC (Extended Unix Code), Next: JIS7, Up: Encodings
Japanese EUC (Extended Unix Code)
---------------------------------
This encompasses the character sets Printing-ASCII,
Japanese-JISX0208, and Japanese-JISX0201-Kana (half-width katakana,
the right half of JISX0201). It uses 8-bit bytes.
Note that Printing-ASCII and Japanese-JISX0201-Kana are 94-character
charsets, while Japanese-JISX0208 is a 94x94-character charset.
The encoding is as follows:
Character set              Representation (PC=position-code)
-------------              --------------
Printing-ASCII             PC1
Japanese-JISX0201-Kana     0x8E       | PC1 + 0x80
Japanese-JISX0208          PC1 + 0x80 | PC2 + 0x80
Japanese-JISX0212          0x8F       | PC1 + 0x80 | PC2 + 0x80
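As an illustration, the Japanese-JISX0208 and Japanese-JISX0201-Kana
rows of the table translate into code as below.  This is a sketch, not
code from XEmacs; the function names `euc_encode_jisx0208' and
`euc_encode_jisx0201_kana' are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Encode a Japanese-JISX0208 character, given its two position
   codes, into OUT.  Returns the number of bytes written. */
size_t
euc_encode_jisx0208 (int pc1, int pc2, unsigned char *out)
{
  assert (pc1 >= 33 && pc1 <= 126 && pc2 >= 33 && pc2 <= 126);
  out[0] = (unsigned char) (pc1 + 0x80);
  out[1] = (unsigned char) (pc2 + 0x80);
  return 2;
}

/* Encode a Japanese-JISX0201-Kana character: the byte 0x8E
   followed by the position code with the high bit set. */
size_t
euc_encode_jisx0201_kana (int pc1, unsigned char *out)
{
  assert (pc1 >= 33 && pc1 <= 126);
  out[0] = 0x8E;
  out[1] = (unsigned char) (pc1 + 0x80);
  return 2;
}
```

Because every non-ASCII byte has its high bit set, Printing-ASCII
characters can be passed through unchanged as single bytes.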
File: internals.info, Node: JIS7, Prev: Japanese EUC (Extended Unix Code), Up: Encodings
JIS7
----
This encompasses the character sets Printing-ASCII,
Japanese-JISX0201-Roman (the left half of JISX0201; this character set
is very similar to Printing-ASCII and is a 94-character charset),
Japanese-JISX0208, and Japanese-JISX0201-Kana. It uses 7-bit bytes.
Unlike Japanese EUC, this is a "modal" encoding, which means that
there are multiple states that the encoding can be in, which affect how
the bytes are to be interpreted. Special sequences of bytes (called
"escape sequences") are used to change states.
The encoding is as follows:
Character set              Representation (PC=position-code)
-------------              --------------
Printing-ASCII             PC1
Japanese-JISX0201-Roman    PC1
Japanese-JISX0201-Kana     PC1
Japanese-JISX0208          PC1 PC2

Escape sequence   ASCII equivalent   Meaning
---------------   ----------------   -------
0x1B 0x28 0x4A    ESC ( J            invoke Japanese-JISX0201-Roman
0x1B 0x28 0x49    ESC ( I            invoke Japanese-JISX0201-Kana
0x1B 0x24 0x42    ESC $ B            invoke Japanese-JISX0208
0x1B 0x28 0x42    ESC ( B            invoke Printing-ASCII
Initially, Printing-ASCII is invoked.
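The modal nature of JIS7 can be illustrated with a small decoder that
tracks the currently invoked charset.  This is a simplified sketch: it
recognizes only the four escape sequences listed above, does not treat
control characters specially, and all names (`jis7_charset',
`jis7_char', `jis7_decode') are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum jis7_charset { JIS7_ASCII, JIS7_ROMAN, JIS7_KANA, JIS7_JISX0208 };

struct jis7_char
{
  enum jis7_charset charset;
  int pc1;
  int pc2;                      /* 0 for dimension-1 charsets */
};

/* Decode LEN bytes of SRC into at most MAX characters in OUT.
   Returns the number of characters decoded, or -1 on an unknown or
   truncated sequence or if OUT is too small. */
int
jis7_decode (const unsigned char *src, size_t len,
             struct jis7_char *out, int max)
{
  enum jis7_charset state = JIS7_ASCII;   /* initial state */
  size_t i = 0;
  int n = 0;

  assert (src != NULL && out != NULL);
  while (i < len)
    {
      if (src[i] == 0x1B)
        {
          /* An escape sequence changes the state; it yields no
             character of its own. */
          if (i + 2 >= len)
            return -1;
          if (src[i + 1] == 0x28 && src[i + 2] == 0x4A)
            state = JIS7_ROMAN;
          else if (src[i + 1] == 0x28 && src[i + 2] == 0x49)
            state = JIS7_KANA;
          else if (src[i + 1] == 0x24 && src[i + 2] == 0x42)
            state = JIS7_JISX0208;
          else if (src[i + 1] == 0x28 && src[i + 2] == 0x42)
            state = JIS7_ASCII;
          else
            return -1;
          i += 3;
          continue;
        }
      if (n >= max)
        return -1;
      out[n].charset = state;
      out[n].pc1 = src[i++];
      out[n].pc2 = 0;
      if (state == JIS7_JISX0208)
        {
          /* Dimension-2 charset: a second position code follows. */
          if (i >= len)
            return -1;
          out[n].pc2 = src[i++];
        }
      n++;
    }
  return n;
}
```

The same byte value is interpreted differently depending on the state,
which is why a modal encoding cannot be decoded starting from an
arbitrary position in the stream.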
File: internals.info, Node: Internal Mule Encodings, Next: CCL, Prev: Encodings, Up: MULE Character Sets and Encodings
Internal Mule Encodings
=======================
In XEmacs/Mule, each character set is assigned a unique number,
called a "leading byte". This is used in the encodings of a character.
Leading bytes are in the range 0x80 - 0xFF (except for ASCII, which has
a leading byte of 0), although some leading bytes are reserved.
Charsets whose leading byte is in the range 0x80 - 0x9F are called
"official" and are used for built-in charsets. Other charsets are
called "private" and have leading bytes in the range 0xA0 - 0xFF; these
are user-defined charsets.
More specifically:
Character set           Leading byte
-------------           ------------
ASCII                   0
Composite               0x80
Dimension-1 Official    0x81 - 0x8D
                          (0x8E is free)
Control-1               0x8F
Dimension-2 Official    0x90 - 0x99
                          (0x9A - 0x9D are free;
                           0x9E and 0x9F are reserved)
Dimension-1 Private     0xA0 - 0xEF
Dimension-2 Private     0xF0 - 0xFF
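The assignment of leading bytes can be summarized as a classification
function.  The sketch below simply restates the table; the enumeration
`lb_class' and the function `classify_leading_byte' are hypothetical
names.

```c
#include <assert.h>

enum lb_class
{
  LB_ASCII, LB_COMPOSITE, LB_DIM1_OFFICIAL, LB_CONTROL_1,
  LB_DIM2_OFFICIAL, LB_DIM1_PRIVATE, LB_DIM2_PRIVATE,
  LB_UNASSIGNED                 /* free, reserved, or not a leading byte */
};

/* Classify a leading byte according to the table above. */
enum lb_class
classify_leading_byte (int lb)
{
  assert (lb >= 0 && lb <= 0xFF);
  if (lb == 0)
    return LB_ASCII;
  if (lb == 0x80)
    return LB_COMPOSITE;
  if (lb >= 0x81 && lb <= 0x8D)
    return LB_DIM1_OFFICIAL;
  if (lb == 0x8F)
    return LB_CONTROL_1;
  if (lb >= 0x90 && lb <= 0x99)
    return LB_DIM2_OFFICIAL;
  if (lb >= 0xA0 && lb <= 0xEF)
    return LB_DIM1_PRIVATE;
  if (lb >= 0xF0)
    return LB_DIM2_PRIVATE;
  /* 0x01 - 0x7F, 0x8E, 0x9A - 0x9D (free), 0x9E and 0x9F (reserved). */
  return LB_UNASSIGNED;
}
```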
There are two internal encodings for characters in XEmacs/Mule. One
is called "string encoding" and is an 8-bit encoding that is used for
representing characters in a buffer or string. It uses 1 to 4 bytes per
character. The other is called "character encoding" and is a 19-bit
encoding that is used for representing characters individually in a
variable.
(In the following descriptions, we'll ignore composite characters for
the moment. We also give a general (structural) overview first,
followed later by the exact details.)
* Menu:
* Internal String Encoding::
* Internal Character Encoding::
File: internals.info, Node: Internal String Encoding, Next: Internal Character Encoding, Up: Internal Mule Encodings
Internal String Encoding
------------------------
ASCII characters are encoded using their position code directly.
Other characters are encoded using their leading byte followed by their
position code(s) with the high bit set. Characters in private character
sets have their leading byte prefixed with a "leading byte prefix",
which is either 0x9E or 0x9F. (No character sets are ever assigned these
leading bytes.) Specifically:
Character set           Encoding (PC=position-code, LB=leading-byte)
-------------           --------
ASCII                                PC1
Control-1                       LB | PC1 + 0xA0
Dimension-1 official            LB | PC1 + 0x80
Dimension-1 private      0x9E | LB | PC1 + 0x80
Dimension-2 official            LB | PC1 + 0x80 | PC2 + 0x80
Dimension-2 private      0x9F | LB | PC1 + 0x80 | PC2 + 0x80
The basic characteristic of this encoding is that the first byte of
every character is in the range 0x00 - 0x9F, and the second and
following bytes of every character are in the range 0xA0 - 0xFF. This
means that it is impossible to get out of sync, or more specifically:
1. Given any byte position, the beginning of the character it is
within can be determined in constant time.
2. Given any byte position at the beginning of a character, the
beginning of the next character can be determined in constant time.
3. Given any byte position at the beginning of a character, the
beginning of the previous character can be determined in constant
time.
4. Textual searches can simply treat encoded strings as if they were
encoded in a one-byte-per-character fashion rather than the actual
multi-byte encoding.
None of the standard non-modal encodings meet all of these
conditions. For example, EUC satisfies only (2) and (3), while
Shift-JIS and Big5 (not yet described) satisfy only (2). (All non-modal
encodings must satisfy (2), in order to be unambiguous.)
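Properties (1) and (2) follow directly from the byte ranges: a byte
below 0xA0 always starts a character.  The following sketch shows the
idea; it is not taken from the XEmacs sources, and the names
`is_first_byte', `char_start' and `next_char' are hypothetical.  Since
a character occupies at most 4 bytes, each loop runs a bounded number
of times.

```c
#include <assert.h>
#include <stddef.h>

/* A byte in the range 0x00 - 0x9F begins a character. */
static int
is_first_byte (unsigned char b)
{
  return b <= 0x9F;
}

/* Return the index of the first byte of the character that
   contains byte position POS. */
size_t
char_start (const unsigned char *s, size_t pos)
{
  while (pos > 0 && !is_first_byte (s[pos]))
    pos--;
  return pos;
}

/* Given POS at the beginning of a character, return the index of
   the beginning of the next character (or LEN at the end). */
size_t
next_char (const unsigned char *s, size_t len, size_t pos)
{
  assert (pos < len && is_first_byte (s[pos]));
  pos++;
  while (pos < len && !is_first_byte (s[pos]))
    pos++;
  return pos;
}
```

In the test data one might use for this sketch, the leading bytes 0x92
(dimension-2 official) and 0xA5 (dimension-1 private, prefixed by 0x9E)
are arbitrary values picked from the ranges given earlier.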
File: internals.info, Node: Internal Character Encoding, Prev: Internal String Encoding, Up: Internal Mule Encodings
Internal Character Encoding
---------------------------
One 19-bit word represents a single character. The word is
separated into three fields:
Bit number:   18 17 16 15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
              <------------> <------------------> <------------------>
Field:              1                  2                    3
Note that fields 2 and 3 hold 7 bits each, while field 1 holds 5
bits.
Character set            Field 1         Field 2         Field 3
-------------            -------         -------         -------
ASCII                       0               0              PC1
   range:                                                (00 - 7F)
Control-1                   0               1              PC1
   range:                                                (00 - 1F)
Dimension-1 official        0            LB - 0x80         PC1
   range:                                (01 - 0D)       (20 - 7F)
Dimension-1 private         0            LB - 0x80         PC1
   range:                                (20 - 6F)       (20 - 7F)
Dimension-2 official     LB - 0x8F         PC1             PC2
   range:                (01 - 0A)       (20 - 7F)       (20 - 7F)
Dimension-2 private      LB - 0xE1         PC1             PC2
   range:                (0F - 1E)       (20 - 7F)       (20 - 7F)
Composite                  0x1F             ?               ?
Note that character codes 0 - 255 are the same as the "binary
encoding" described above.
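Packing and unpacking the three fields is a matter of shifts and masks.
The sketch below assumes only the 5/7/7-bit layout shown above; the
function names are hypothetical and are not the macros used in the
XEmacs sources.

```c
#include <assert.h>

/* Combine the three fields into one 19-bit character code. */
unsigned int
make_char_code (unsigned int f1, unsigned int f2, unsigned int f3)
{
  assert (f1 <= 0x1F && f2 <= 0x7F && f3 <= 0x7F);
  return (f1 << 14) | (f2 << 7) | f3;
}

/* Extract the individual fields again. */
unsigned int
char_field1 (unsigned int c)
{
  return (c >> 14) & 0x1F;
}

unsigned int
char_field2 (unsigned int c)
{
  return (c >> 7) & 0x7F;
}

unsigned int
char_field3 (unsigned int c)
{
  return c & 0x7F;
}

/* Character code for a dimension-2 official charset, following the
   table above: field 1 is LB - 0x8F, fields 2 and 3 are PC1 and PC2. */
unsigned int
make_dim2_official (unsigned int lb, unsigned int pc1, unsigned int pc2)
{
  assert (lb >= 0x90 && lb <= 0x99);
  return make_char_code (lb - 0x8F, pc1, pc2);
}
```

With field 1 and field 2 both zero the code is just the ASCII position
code, and with field 2 equal to 1 the codes 128 - 255 result, which is
how the first 256 codes coincide with the binary-file mapping.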
File: internals.info, Node: CCL, Prev: Internal Mule Encodings, Up: MULE Character Sets and Encodings
CCL
===
CCL PROGRAM SYNTAX:
     CCL_PROGRAM := (CCL_MAIN_BLOCK
                     [ CCL_EOF_BLOCK ])

     CCL_MAIN_BLOCK := CCL_BLOCK
     CCL_EOF_BLOCK := CCL_BLOCK

     CCL_BLOCK := STATEMENT | (STATEMENT [STATEMENT ...])
     STATEMENT :=
             SET | IF | BRANCH | LOOP | REPEAT | BREAK
             | READ | WRITE

     SET := (REG = EXPRESSION) | (REG SELF_OP EXPRESSION)
            | INT-OR-CHAR

     EXPRESSION := ARG | (EXPRESSION OP ARG)

     IF := (if EXPRESSION CCL_BLOCK CCL_BLOCK)
     BRANCH := (branch EXPRESSION CCL_BLOCK [CCL_BLOCK ...])
     LOOP := (loop STATEMENT [STATEMENT ...])
     BREAK := (break)
     REPEAT := (repeat)
             | (write-repeat [REG | INT-OR-CHAR | string])
             | (write-read-repeat REG [INT-OR-CHAR | string | ARRAY]?)
     READ := (read REG) | (read REG REG)
             | (read-if REG ARITH_OP ARG CCL_BLOCK CCL_BLOCK)
             | (read-branch REG CCL_BLOCK [CCL_BLOCK ...])
     WRITE := (write REG) | (write REG REG)
             | (write INT-OR-CHAR) | (write STRING) | STRING
             | (write REG ARRAY)
     END := (end)

     REG := r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7
     ARG := REG | INT-OR-CHAR
     OP :=   + | - | * | / | % | & | '|' | ^ | << | >> | <8 | >8 | //
             | < | > | == | <= | >= | !=
     SELF_OP :=
             += | -= | *= | /= | %= | &= | '|=' | ^= | <<= | >>=
     ARRAY := '[' INT-OR-CHAR ... ']'
     INT-OR-CHAR := INT | CHAR
MACHINE CODE:

The machine code consists of a vector of 32-bit words.
The first such word specifies the start of the EOF section of the code;
this is the code executed to handle any stuff that needs to be done
(e.g. designating back to ASCII and left-to-right mode) after all
other encoded/decoded data has been written out. This is not used for
charset CCL programs.

REGISTER: 0..7  -- referred to by RRR or rrr

OPERATOR BIT FIELD (27-bit): XXXXXXXXXXXXXXX RRR TTTTT
        TTTTT (5-bit): operator type
        RRR (3-bit): register number
        XXXXXXXXXXXXXXX (15-bit):
                CCCCCCCCCCCCCCC: constant or address
                000000000000rrr: register number

AAAAA:  00000   +
        00001   -
        00010   *
        00011   /
        00100   %
        00101   &
        00110   |
        00111   ~
        01000   <<
        01001   >>
        01010   <8
        01011   >8
        01100   //
        01101   not used
        01110   not used
        01111   not used
        10000   <
        10001   >
        10010   ==
        10011   <=
        10100   >=
        10101   !=
OPERATORS:      TTTTT RRR XX..

SetCS:          00000 RRR C...C      RRR = C...C
SetCL:          00001 RRR .....      RRR = c...c
                c.............c
SetR:           00010 RRR ..rrr      RRR = rrr
SetA:           00011 RRR ..rrr      RRR = array[rrr]
                C.............C      size of array = C...C
                c.............c      contents = c...c

Jump:           00100 000 c...c      jump to c...c
JumpCond:       00101 RRR c...c      if (!RRR) jump to c...c
WriteJump:      00110 RRR c...c      Write1 RRR, jump to c...c
WriteReadJump:  00111 RRR c...c      Write1, Read1 RRR, jump to c...c
WriteCJump:     01000 000 c...c      Write1 C...C, jump to c...c
                C...C
WriteCReadJump: 01001 RRR c...c      Write1 C...C, Read1 RRR,
                C.............C      and jump to c...c
WriteSJump:     01010 000 c...c      WriteS, jump to c...c
                C.............C
                S.............S
                ...
WriteSReadJump: 01011 RRR c...c      WriteS, Read1 RRR, jump to c...c
                C.............C
                S.............S
                ...
WriteAReadJump: 01100 RRR c...c      WriteA, Read1 RRR, jump to c...c
                C.............C      size of array = C...C
                c.............c      contents = c...c
                ...
Branch:         01101 RRR C...C      if (RRR >= 0 && RRR < C..)
                c.............c      branch to (RRR+1)th address
Read1:          01110 RRR ...        read 1-byte to RRR
Read2:          01111 RRR ..rrr      read 2-byte to RRR and rrr
ReadBranch:     10000 RRR C...C      Read1 and Branch
                c.............c
                ...
Write1:         10001 RRR .....      write 1-byte RRR
Write2:         10010 RRR ..rrr      write 2-byte RRR and rrr
WriteC:         10011 000 .....      write 1-char C...CC
                C.............C
WriteS:         10100 000 .....      write C..-byte of string
                C.............C
                S.............S
                ...
WriteA:         10101 RRR .....      write array[RRR]
                C.............C      size of array = C...C
                c.............c      contents = c...c
                ...
End:            10110 000 .....      terminate the execution

SetSelfCS:      10111 RRR C...C      RRR AAAAA= C...C
                ..........AAAAA
SetSelfCL:      11000 RRR .....      RRR AAAAA= c...c
                c.............c
                ..........AAAAA
SetSelfR:       11001 RRR ..Rrr      RRR AAAAA= rrr
                ..........AAAAA
SetExprCL:      11010 RRR ..Rrr      RRR = rrr AAAAA c...c
                c.............c
                ..........AAAAA
SetExprR:       11011 RRR ..rrr      RRR = rrr AAAAA Rrr
                ............Rrr
                ..........AAAAA
JumpCondC:      11100 RRR c...c      if !(RRR AAAAA C..) jump to c...c
                C.............C
                ..........AAAAA
JumpCondR:      11101 RRR c...c      if !(RRR AAAAA rrr) jump to c...c
                ............rrr
                ..........AAAAA
ReadJumpCondC:  11110 RRR c...c      Read1 and JumpCondC
                C.............C
                ..........AAAAA
ReadJumpCondR:  11111 RRR c...c      Read1 and JumpCondR
                ............rrr
                ..........AAAAA
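Splitting an operator word into its fields might look like the
following.  This sketch assumes that the layout `XX.. RRR TTTTT' is
written most significant bit first, so that the operator type occupies
the low 5 bits and the register number the next 3; that assumption, and
the names `ccl_insn', `ccl_decode_word' and `ccl_encode_word', are not
taken from the CCL interpreter itself.

```c
#include <assert.h>

struct ccl_insn
{
  unsigned int op;        /* TTTTT: operator type */
  unsigned int reg;       /* RRR: register number */
  unsigned int operand;   /* XX..: constant, address, or register */
};

/* Split one operator word into its three fields. */
struct ccl_insn
ccl_decode_word (unsigned long word)
{
  struct ccl_insn insn;

  insn.op = (unsigned int) (word & 0x1F);
  insn.reg = (unsigned int) ((word >> 5) & 0x7);
  insn.operand = (unsigned int) (word >> 8);
  return insn;
}

/* Build an operator word from its fields (inverse of the above). */
unsigned long
ccl_encode_word (unsigned int op, unsigned int reg, unsigned int operand)
{
  assert (op <= 0x1F && reg <= 7);
  return ((unsigned long) operand << 8) | (reg << 5) | op;
}
```

For example, under these assumptions a SetCS instruction (type 00000)
that loads the constant 65 into register 3 would be built with
`ccl_encode_word (0, 3, 65)'.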
File: internals.info, Node: The Lisp Reader and Compiler, Next: Lstreams, Prev: MULE Character Sets and Encodings, Up: Top
The Lisp Reader and Compiler
****************************
Not yet documented.
File: internals.info, Node: Lstreams, Next: Consoles; Devices; Frames; Windows, Prev: The Lisp Reader and Compiler, Up: Top
Lstreams
********
An "lstream" is an internal Lisp object that provides a generic
buffering stream implementation. Conceptually, you send data to the
stream or read data from the stream, not caring what's on the other end
of the stream. The other end could be another stream, a file
descriptor, a stdio stream, a fixed block of memory, a reallocating
block of memory, etc. The main purpose of the stream is to provide a
standard interface and to do buffering. Macros are defined to read or
write characters, so the calling functions do not have to worry about
blocking data together in order to achieve efficiency.
* Menu:
* Creating an Lstream:: Creating an lstream object.
* Lstream Types:: Different sorts of things that are streamed.
* Lstream Functions:: Functions for working with lstreams.
* Lstream Methods:: Creating new lstream types.
File: internals.info, Node: Creating an Lstream, Next: Lstream Types, Up: Lstreams
Creating an Lstream
===================
Lstreams come in different types, depending on what is being
interfaced to. Although the primitive for creating new lstreams is
`Lstream_new()', generally you do not call this directly. Instead, you
call some type-specific creation function, which creates the lstream
and initializes it as appropriate for the particular type.
All lstream creation functions take a MODE argument, specifying what
mode the lstream should be opened as. This controls whether the
lstream is for input or output, and optionally whether data should be
blocked up in units of MULE characters. Note that some types of
lstreams can only be opened for input; others only for output; and
others can be opened either way. #### Richard Mlynarik thinks that
there should be a strict separation between input and output streams,
and he's probably right.
MODE is a string, one of
`"r"'
Open for reading.
`"w"'
Open for writing.
`"rc"'
Open for reading, but "read" never returns partial MULE characters.
`"wc"'
Open for writing, but never writes partial MULE characters.
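A creation function has to turn the MODE string into flags of some
kind.  The sketch below shows one possible way to do so; the structure
`lstream_mode' and the function `parse_lstream_mode' are hypothetical
and do not describe how `Lstream_new()' actually stores the mode.

```c
#include <assert.h>
#include <string.h>

struct lstream_mode
{
  int readable;      /* opened for reading */
  int writable;      /* opened for writing */
  int whole_chars;   /* never split a MULE character */
};

/* Parse one of the four mode strings listed above.  Returns 0 on
   success, -1 if MODE is not recognized. */
int
parse_lstream_mode (const char *mode, struct lstream_mode *out)
{
  assert (mode != NULL && out != NULL);
  out->readable = out->writable = out->whole_chars = 0;

  if (strcmp (mode, "r") == 0)
    out->readable = 1;
  else if (strcmp (mode, "w") == 0)
    out->writable = 1;
  else if (strcmp (mode, "rc") == 0)
    out->readable = out->whole_chars = 1;
  else if (strcmp (mode, "wc") == 0)
    out->writable = out->whole_chars = 1;
  else
    return -1;
  return 0;
}
```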
File: internals.info, Node: Lstream Types, Next: Lstream Functions, Prev: Creating an Lstream, Up: Lstreams
Lstream Types
=============
stdio
filedesc
lisp-string
fixed-buffer
resizing-buffer
dynarr
lisp-buffer
print
decoding
encoding
File: internals.info, Node: Lstream Functions, Next: Lstream Methods, Prev: Lstream Types, Up: Lstreams
Lstream Functions
=================
- Function: Lstream * Lstream_new (Lstream_implementation *IMP, CONST
char *MODE)
Allocate and return a new Lstream. This function is not really
meant to be called directly; rather, each stream type should
provide its own stream creation function, which creates the stream
and does any other necessary creation stuff (e.g. opening a file).
- Function: void Lstream_set_buffering (Lstream *LSTR,
Lstream_buffering BUFFERING, int BUFFERING_SIZE)
Change the buffering of a stream. See `lstream.h'. By default the
buffering is `STREAM_BLOCK_BUFFERED'.
- Function: int Lstream_flush (Lstream *LSTR)
Flush out any pending unwritten data in the stream. Clear any
buffered input data. Returns 0 on success, -1 on error.
- Macro: int Lstream_putc (Lstream *STREAM, int C)
Write out one byte to the stream. This is a macro and so it is
very efficient. The C argument is only evaluated once but the
STREAM argument is evaluated more than once. Returns 0 on
success, -1 on error.
- Macro: int Lstream_getc (Lstream *STREAM)
Read one byte from the stream. This is a macro and so it is very
efficient. The STREAM argument is evaluated more than once.
Return value is -1 for EOF or error.
- Macro: void Lstream_ungetc (Lstream *STREAM, int C)
Push one byte back onto the input queue. This will be the next
byte read from the stream. Any number of bytes can be pushed back
and will be read in the reverse order they were pushed back - most
recent first. (This is necessary for consistency - if there are a
number of bytes that have been unread and I read and unread a
byte, it needs to be the first to be read again.) This is a macro
and so it is very efficient. The C argument is only evaluated
once but the STREAM argument is evaluated more than once.
- Function: int Lstream_fputc (Lstream *STREAM, int C)
- Function: int Lstream_fgetc (Lstream *STREAM)
- Function: void Lstream_fungetc (Lstream *STREAM, int C)
Function equivalents of the above macros.
- Function: int Lstream_read (Lstream *STREAM, void *DATA, int SIZE)
Read SIZE bytes of DATA from the stream. Return the number of
bytes read. 0 means EOF. -1 means an error occurred and no bytes
were read.
- Function: int Lstream_write (Lstream *STREAM, void *DATA, int SIZE)
Write SIZE bytes of DATA to the stream. Return the number of
bytes written. -1 means an error occurred and no bytes were
written.
- Function: void Lstream_unread (Lstream *STREAM, void *DATA, int SIZE)
Push back SIZE bytes of DATA onto the input queue. The next call
to `Lstream_read()' with the same size will read the same bytes
back. Note that this will be the case even if there is other
pending unread data.
- Function: int Lstream_close (Lstream *STREAM)
Close the stream. All data will be flushed out.
- Function: void Lstream_reopen (Lstream *STREAM)
Reopen a closed stream. This enables I/O on it again. This is not
meant to be called except from a wrapper routine that reinitializes
variables and such - the close routine may well have freed some
necessary storage structures, for example.
- Function: void Lstream_rewind (Lstream *STREAM)
Rewind the stream to the beginning.
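The push-back order described for `Lstream_ungetc' (most recent first)
is that of a stack.  The toy stream below illustrates the behavior; it
is a stand-alone sketch and does not use `lstream.h'.  The names
`toy_stream', `toy_getc' and `toy_ungetc', and the fixed push-back
limit, are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define PUSHBACK_MAX 16

/* A toy input stream over a fixed block of memory, with a small
   push-back stack. */
struct toy_stream
{
  const unsigned char *data;
  size_t len;
  size_t pos;
  unsigned char unget[PUSHBACK_MAX];
  size_t unget_count;
};

/* Read one byte; pushed-back bytes are returned first, most
   recently pushed first.  Returns -1 at EOF. */
int
toy_getc (struct toy_stream *s)
{
  if (s->unget_count > 0)
    return s->unget[--s->unget_count];
  if (s->pos < s->len)
    return s->data[s->pos++];
  return -1;
}

/* Push one byte back; it will be the next byte read. */
void
toy_ungetc (struct toy_stream *s, int c)
{
  assert (s->unget_count < PUSHBACK_MAX);
  s->unget[s->unget_count++] = (unsigned char) c;
}
```

Reading a byte and immediately pushing it back leaves the stream in the
same state as before, regardless of how many bytes were already pushed
back, which is the consistency requirement mentioned above.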
File: internals.info, Node: Lstream Methods, Prev: Lstream Functions, Up: Lstreams
Lstream Methods
===============
- Lstream Method: int reader (Lstream *STREAM, unsigned char *DATA,
int SIZE)
Read some data from the stream's end and store it into DATA, which
can hold SIZE bytes. Return the number of bytes read. A return
value of 0 means no bytes can be read at this time. This may be
because of an EOF, or because there is a granularity greater than
one byte that the stream imposes on the returned data, and SIZE is
less than this granularity. (This will happen frequently for
streams that need to return whole characters, because
`Lstream_read()' calls the reader function repeatedly until it has
the number of bytes it wants or until 0 is returned.) The lstream
functions do not treat a 0 return as EOF or do anything special;
however, the calling function will interpret any 0 it gets back as
EOF. This will normally not happen unless the caller calls
`Lstream_read()' with a very small size.
This function can be `NULL' if the stream is output-only.
- Lstream Method: int writer (Lstream *STREAM, CONST unsigned char
*DATA, int SIZE)
Send some data to the stream's end. Data to be sent is in DATA
and is SIZE bytes. Return the number of bytes sent. This
function can send and return fewer bytes than is passed in; in that
case, the function will just be called again until there is no
data left or 0 is returned. A return value of 0 means that no
more data can be currently stored, but there is no error; the data
will be squirreled away until the writer can accept data. (This is
useful, e.g., if you're dealing with a non-blocking file
descriptor and are getting `EWOULDBLOCK' errors.) This function
can be `NULL' if the stream is input-only.
- Lstream Method: int rewinder (Lstream *STREAM)
Rewind the stream. If this is `NULL', the stream is not seekable.
- Lstream Method: int seekable_p (Lstream *STREAM)
Indicate whether this stream is seekable - i.e. it can be rewound.
This method is ignored if the stream does not have a rewind
method. If this method is not present, the result is determined
by whether a rewind method is present.
- Lstream Method: int flusher (Lstream *STREAM)
Perform any additional operations necessary to flush the data in
this stream.
- Lstream Method: int pseudo_closer (Lstream *STREAM)
- Lstream Method: int closer (Lstream *STREAM)
Perform any additional operations necessary to close this stream
down. May be `NULL'. This function is called when
`Lstream_close()' is called or when the stream is
garbage-collected. When this function is called, all pending data
in the stream will already have been written out.
- Lstream Method: Lisp_Object marker (Lisp_Object LSTREAM, void
(*MARKFUN) (Lisp_Object))
Mark this object for garbage collection. Same semantics as a
standard `Lisp_Object' marker. This function can be `NULL'.
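The retry contract of the `writer' method and the seekability rule can
be sketched as follows.  This is only an illustration under stated
assumptions, not the actual lstream code: every `toy_' name is
hypothetical, the stream is simulated by a 16-byte buffer that accepts
at most `chunk' bytes per call, and the return value of the rewinder
is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_stream;

/* Hypothetical method table; the real Lstream implementation differs. */
struct toy_methods
{
  int (*writer) (struct toy_stream *, const unsigned char *, int);
  int (*rewinder) (struct toy_stream *);
  int (*seekable_p) (struct toy_stream *);
};

struct toy_stream
{
  const struct toy_methods *methods;
  unsigned char buf[16];   /* simulated destination */
  int used;                /* bytes stored so far */
  int chunk;               /* most bytes accepted per call */
};

/* Accept up to `chunk' bytes; return the count accepted, or 0 if
   nothing can be stored right now (not an error).  */
int
toy_writer (struct toy_stream *s, const unsigned char *data, int size)
{
  int room = (int) sizeof (s->buf) - s->used;
  int n = size;
  if (n > s->chunk) n = s->chunk;
  if (n > room) n = room;
  if (n <= 0) return 0;
  memcpy (s->buf + s->used, data, (size_t) n);
  s->used += n;
  return n;
}

/* Assumed convention: return 0 on success.  */
int
toy_rewinder (struct toy_stream *s)
{
  s->used = 0;
  return 0;
}

/* A seekable_p method that always answers "no".  */
int
toy_not_seekable (struct toy_stream *s)
{
  (void) s;
  return 0;
}

/* Caller side: call the writer again until no data is left or it
   returns 0.  The unsent remainder would have to be held back.  */
int
toy_write_all (struct toy_stream *s, const unsigned char *data, int size)
{
  int sent = 0;
  assert (s->methods->writer != NULL);  /* NULL means input-only */
  while (sent < size)
    {
      int n = s->methods->writer (s, data + sent, size - sent);
      if (n <= 0)
        break;
      sent += n;
    }
  return sent;
}

/* No rewinder: never seekable.  Rewinder but no seekable_p: seekable.
   Both present: ask seekable_p.  */
int
toy_is_seekable (struct toy_stream *s)
{
  if (s->methods->rewinder == NULL)
    return 0;
  if (s->methods->seekable_p == NULL)
    return 1;
  return s->methods->seekable_p (s) != 0;
}
```

A caller that gets back fewer bytes than it passed to `toy_write_all'
knows the writer returned 0 and must keep the remainder for a later
attempt.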
File: internals.info, Node: Consoles; Devices; Frames; Windows, Next: The Redisplay Mechanism, Prev: Lstreams, Up: Top
Consoles; Devices; Frames; Windows
**********************************
* Menu:
* Introduction to Consoles; Devices; Frames; Windows::
* Point::
* Window Hierarchy::
* The Window Object::
File: internals.info, Node: Introduction to Consoles; Devices; Frames; Windows, Next: Point, Up: Consoles; Devices; Frames; Windows
Introduction to Consoles; Devices; Frames; Windows
==================================================
A window-system window that you see on the screen is called a
"frame" in Emacs terminology. Each frame is subdivided into one or
more non-overlapping panes, called (confusingly) "windows". Each
window displays the text of a buffer in it. (See above on Buffers.) Note
that buffers and windows are independent entities: Two or more windows
can be displaying the same buffer (potentially in different locations),
and a buffer can be displayed in no windows.
A single display screen that contains one or more frames is called a
"display". Under most circumstances, there is only one display.
However, more than one display can exist, for example if you have a
"multi-headed" console, i.e. one with a single keyboard but multiple
displays. (Typically in such a situation, the various displays act like
one large display, in that the mouse is only in one of them at a time,
and moving the mouse off of one moves it into another.) In some cases,
the different displays will have different characteristics, e.g. one
color and one mono.
XEmacs can display frames on multiple displays. It can even deal
simultaneously with frames on multiple keyboards (called "consoles" in
XEmacs terminology). Here is one case where this might be useful: You
are using XEmacs on your workstation at work, and leave it running.
Then you go home and dial in on a TTY line, and you can use the
already-running XEmacs process to display another frame on your local
TTY.
Thus, there is a hierarchy console -> display -> frame -> window.
There is a separate Lisp object type for each of these four concepts.
Furthermore, there is logically a "selected console", "selected
display", "selected frame", and "selected window". Each of these
objects is distinguished in various ways, such as being the default
object for various functions that act on objects of that type. Note
that every containing object remembers the "selected" object among
the objects that it contains: e.g. not only is there a selected window,
but every frame remembers the last window in it that was selected, and
changing the selected frame causes the remembered window within it to
become the selected window.  Similar relationships apply between
consoles and devices (the "displays" above) and between devices and
frames.
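The "every container remembers its selected child" rule can be
pictured with a few structures.  This is a hypothetical sketch, not the
real XEmacs object layout: all `toy_' names are invented, and
`toy_device' stands for the level the text calls a "display".

```c
#include <assert.h>
#include <stddef.h>

struct toy_window  { int id; };
struct toy_frame   { struct toy_window *selected_window; };
struct toy_device  { struct toy_frame *selected_frame; };
struct toy_console { struct toy_device *selected_device; };

/* The selected window is found by following the remembered
   selection at each level of the hierarchy.  */
struct toy_window *
toy_selected_window (const struct toy_console *con)
{
  assert (con != NULL && con->selected_device != NULL);
  assert (con->selected_device->selected_frame != NULL);
  return con->selected_device->selected_frame->selected_window;
}

/* Selecting a frame needs no extra bookkeeping for windows: the
   window that the frame remembers becomes the selected window.  */
void
toy_select_frame (struct toy_device *dev, struct toy_frame *f)
{
  dev->selected_frame = f;
}
```

Because the selection is stored per container, switching frames and
switching back restores the window that was last selected in each one.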
File: internals.info, Node: Point, Next: Window Hierarchy, Prev: Introduction to Consoles; Devices; Frames; Windows, Up: Consoles; Devices; Frames; Windows
Point
=====
Recall that every buffer has a current insertion position, called
"point". Now, two or more windows may be displaying the same buffer,
and the text cursor in the two windows (i.e. `point') can be in two
different places. You may ask, how can that be, since each buffer has
only one value of `point'? The answer is that each window also has a
value of `point' that is squirreled away in it. There is only one
selected window, and the value of `point' in its buffer corresponds to
that window. When the selected window is changed from one window to
another displaying the same buffer, the old value of `point' is stored
into the old window's "point" and the value of `point' from the new
window is retrieved and made the value of `point' in the buffer. This
means that `window-point' for the selected window is potentially
inaccurate, and if you want to retrieve the correct value of `point'
for a window, you must special-case on the selected window and retrieve
the buffer's point instead. This is related to why
`save-window-excursion' does not save the selected window's value of
`point'.
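The swapping of `point' between a buffer and its windows can be
sketched like this.  It is a simplified model under stated
assumptions: the `toy_' names are hypothetical, and positions are
plain integers here, whereas the real `pointm' field holds a marker.

```c
#include <assert.h>
#include <stddef.h>

struct toy_buffer { int point; };
struct toy_window { struct toy_buffer *buffer; int pointm; };

/* Change the selected window: stash the buffer's point in the old
   window, then load the new window's stashed value into its buffer.  */
void
toy_select_window (struct toy_window **selected, struct toy_window *w)
{
  struct toy_window *old = *selected;
  assert (w != NULL);
  if (old == w)
    return;
  if (old != NULL)
    old->pointm = old->buffer->point;
  w->buffer->point = w->pointm;
  *selected = w;
}

/* Correct value of point for W: the stashed value may be stale for
   the selected window, so that case reads the buffer instead.  */
int
toy_window_point (const struct toy_window *selected,
                  const struct toy_window *w)
{
  return w == selected ? w->buffer->point : w->pointm;
}
```

The special case in `toy_window_point' is the one the text warns
about: while a window is selected, its own stashed value is not kept
up to date.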
File: internals.info, Node: Window Hierarchy, Next: The Window Object, Prev: Point, Up: Consoles; Devices; Frames; Windows
Window Hierarchy
================
If a frame contains multiple windows (panes), they are always created
by splitting an existing window along the horizontal or vertical axis.
Terminology is a bit confusing here: to "split a window horizontally"
means to create two side-by-side windows, i.e. to make a *vertical* cut
in a window. Likewise, to "split a window vertically" means to create
two windows, one above the other, by making a *horizontal* cut.
If you split a window and then split again along the same axis, you
will end up with a number of panes all arranged along the same axis.
The precise way in which the splits were made should not be important,
and this is reflected internally. Internally, all windows are arranged
in a tree, consisting of two types of windows, "combination" windows
(which have children, and are covered completely by those children) and
"leaf" windows, which have no children and are visible. Every
combination window has two or more children, all arranged along the same
axis. There are (logically) two subtypes of windows, depending on
whether their children are horizontally or vertically arrayed. There is
always one root window, which is either a leaf window (if the frame
contains only one window) or a combination window (if the frame contains
more than one window). In the latter case, the root window will have
two or more children, either horizontally or vertically arrayed, and
each of those children will be either a leaf window or another
combination window.
Here are some rules:
1. Horizontal combination windows can never have children that are
horizontal combination windows; same for vertical.
2. Only leaf windows can be split (obviously) and this splitting does
one of two things: (a) turns the leaf window into a combination
window and creates two new leaf children, or (b) turns the leaf
window into one of the two new leaves and creates the other leaf.
Rule (1) dictates which of these two outcomes happens.
3. Every combination window must have at least two children.
4. Leaf windows can never become combination windows. They can be
deleted, however. If this results in a violation of (3), the
parent combination window also gets deleted.
5. All functions that accept windows must be prepared to accept
combination windows, and do something sane (e.g. signal an error
when given one). Combination windows *do* escape to the Lisp level.
6. All windows have three fields governing their contents: these are
"hchild" (a list of horizontally-arrayed children), "vchild" (a
list of vertically-arrayed children), and "buffer" (the buffer
contained in a leaf window). Exactly one of these will be
non-nil. Remember that "horizontally-arrayed" means
"side-by-side" and "vertically-arrayed" means "one above the
other".
7. Leaf windows also have markers in their `start' (the first buffer
position displayed in the window) and `pointm' (the window's
stashed value of `point' - see above) fields, while combination
windows have nil in these fields.
8. The list of children for a window is threaded through the `next'
and `prev' fields of each child window.
9. *Deleted windows can be undeleted*. This happens as a result of
restoring a window configuration, and is unlike frames, displays,
and consoles, which, once deleted, can never be restored.
Deleting a window does nothing except set a special `dead' bit to
1 and clear out the `next', `prev', `hchild', and `vchild' fields,
for GC purposes.
10. Most frames actually have two top-level windows - one for the
minibuffer and one (the "root") for everything else. The modeline
(if present) separates these two. The `next' field of the root
points to the minibuffer, and the `prev' field of the minibuffer
points to the root. The other `next' and `prev' fields are `nil',
and the frame points to both of these windows. Minibuffer-less
frames have no minibuffer window, and the `next' and `prev' of the
root window are `nil'. Minibuffer-only frames have no root
window, and the `next' of the minibuffer window is `nil' but the
`prev' points to itself. (#### This is an artifact that should be
fixed.)
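Rules 1, 2, and 8 can be illustrated with a small tree model.  This is
a sketch under several assumptions, not the XEmacs implementation: the
`toy_' names are hypothetical, a buffer is reduced to a nonzero
integer id, the minibuffer window is not modeled, and (in keeping with
rule 4) the leaf object stays a leaf while a fresh combination window
takes over its place in the tree.

```c
#include <assert.h>
#include <stdlib.h>

enum toy_axis { TOY_HORIZ, TOY_VERT };

struct toy_window
{
  struct toy_window *parent, *next, *prev;
  struct toy_window *hchild;   /* first side-by-side child */
  struct toy_window *vchild;   /* first stacked child */
  int buffer;                  /* nonzero id in a leaf, 0 otherwise */
};

struct toy_window *
toy_new_leaf (int buffer)
{
  struct toy_window *w = calloc (1, sizeof *w);
  assert (w != NULL);
  w->buffer = buffer;
  return w;
}

/* Put a new combination window along AXIS where leaf W was, and
   make W its only child for the moment.  */
static void
toy_wrap (struct toy_window *w, enum toy_axis axis)
{
  struct toy_window *c = calloc (1, sizeof *c);
  assert (c != NULL);
  c->parent = w->parent;
  c->next = w->next;
  c->prev = w->prev;
  if (c->next) c->next->prev = c;
  if (c->prev) c->prev->next = c;
  if (c->parent && c->parent->hchild == w) c->parent->hchild = c;
  if (c->parent && c->parent->vchild == w) c->parent->vchild = c;
  if (axis == TOY_HORIZ) c->hchild = w; else c->vchild = w;
  w->parent = c;
  w->next = w->prev = NULL;
}

/* Split leaf W along AXIS and return the new sibling leaf.  If the
   parent already arrays its children along AXIS, the new leaf just
   joins the chain (case b); otherwise a combination window is
   created first (case a).  This keeps rule 1 true.  */
struct toy_window *
toy_split (struct toy_window *w, enum toy_axis axis)
{
  struct toy_window *p = w->parent;
  struct toy_window *fresh;
  int same_axis;
  assert (w->buffer != 0);     /* only leaf windows can be split */
  same_axis = p && (axis == TOY_HORIZ ? p->hchild != NULL
                                      : p->vchild != NULL);
  if (!same_axis)
    toy_wrap (w, axis);
  fresh = toy_new_leaf (w->buffer);
  fresh->parent = w->parent;
  fresh->prev = w;
  fresh->next = w->next;
  if (w->next) w->next->prev = fresh;
  w->next = fresh;
  return fresh;
}
```

Splitting the same window twice along one axis yields three siblings
under a single combination window, which is the "precise way in which
the splits were made should not be important" property described
above.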
File: internals.info, Node: The Window Object, Prev: Window Hierarchy, Up: Consoles; Devices; Frames; Windows
The Window Object
=================
Windows have the following accessible fields:
`frame'
The frame that this window is on.
`mini_p'
Non-`nil' if this window is a minibuffer window.
`buffer'
The buffer that the window is displaying. This may change often
during the life of the window.
`dedicated'
Non-`nil' if this window is dedicated to its buffer.
`pointm'
This is the value of point in the current buffer when this window
is selected; when it is not selected, it retains its previous
value.
`start'
The position in the buffer that is the first character to be
displayed in the window.
`force_start'
If this flag is non-`nil', it says that the window has been
scrolled explicitly by the Lisp program. This affects what the
next redisplay does if point is off the screen: instead of
scrolling the window to show the text around point, it moves point
to a location that is on the screen.
`last_modified'
The `modified' field of the window's buffer, as of the last time a
redisplay completed in this window.
`last_point'
The buffer's value of point, as of the last time a redisplay
completed in this window.
`left'
This is the left-hand edge of the window, measured in columns.
(The leftmost column on the screen is column 0.)
`top'
This is the top edge of the window, measured in lines. (The top
line on the screen is line 0.)
`height'
The height of the window, measured in lines.
`width'
The width of the window, measured in columns.
`next'
This is the window that is the next in the chain of siblings. It
is `nil' in a window that is the rightmost or bottommost of a
group of siblings.
`prev'
This is the window that is the previous in the chain of siblings.
It is `nil' in a window that is the leftmost or topmost of a group
of siblings.
`parent'
Internally, XEmacs arranges windows in a tree; each group of
siblings has a parent window whose area includes all the siblings.
This field points to a window's parent.
Parent windows do not display buffers, and play little role in
display except to shape their child windows. Emacs Lisp programs
usually have no access to the parent windows; they operate on the
windows at the leaves of the tree, which actually display buffers.
`hscroll'
This is the number of columns that the display in the window is
scrolled horizontally to the left. Normally, this is 0.
`use_time'
This is the last time that the window was selected. The function
`get-lru-window' uses this field.
`display_table'
The window's display table, or `nil' if none is specified for it.
`update_mode_line'
Non-`nil' means this window's mode line needs to be updated.
`base_line_number'
The line number of a certain position in the buffer, or `nil'.
This is used for displaying the line number of point in the mode
line.
`base_line_pos'
The position in the buffer for which the line number is known, or
`nil' meaning none is known.
`region_showing'
If the region (or part of it) is highlighted in this window, this
field holds the mark position that made one end of that region.
Otherwise, this field is `nil'.
File: internals.info, Node: The Redisplay Mechanism, Next: Extents, Prev: Consoles; Devices; Frames; Windows, Up: Top
The Redisplay Mechanism
***********************
The redisplay mechanism is one of the most complicated sections of
XEmacs, especially from a conceptual standpoint. This is doubly so
because, unlike for the basic aspects of the Lisp interpreter, the
computer science theories of how to efficiently handle redisplay are not
well-developed.
When working with the redisplay mechanism, remember the Golden Rules
of Redisplay:
1. It Is Better To Be Correct Than Fast.
2. Thou Shalt Not Run Elisp From Within Redisplay.
3. It Is Better To Be Fast Than Not To Be.
* Menu:
* Critical Redisplay Sections::
* Line Start Cache::
File: internals.info, Node: Critical Redisplay Sections, Next: Line Start Cache, Up: The Redisplay Mechanism
Critical Redisplay Sections
===========================
Within this section, we are defenseless and assume that the
following cannot happen:
1. garbage collection
2. Lisp code evaluation
3. frame size changes
We ensure (3) by calling `hold_frame_size_changes()', which will
cause any pending frame size changes to get put on hold till after the
end of the critical section. (1) follows automatically if (2) is met.
#### Unfortunately, there are some places where Lisp code can be called
within this section. We need to remove them.
If `Fsignal()' is called during this critical section, we will
`abort()'.
If garbage collection is called during this critical section, we
simply return. #### We should abort instead.
#### If a frame-size change does occur we should probably actually
be preempting redisplay.
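The idea of putting frame size changes on hold can be sketched as
follows.  The text names `hold_frame_size_changes()' but does not show
its signature or implementation, so everything here is an assumption:
the `toy_' functions are hypothetical stand-ins that only model
"queue the change now, apply it after the critical section".

```c
#include <assert.h>

struct toy_frame
{
  int width, height;     /* size redisplay is laying out against */
  int pending;           /* nonzero if a change arrived while held */
  int pending_width, pending_height;
};

static int toy_hold_depth;   /* > 0 while in a critical section */

/* Request a new frame size.  Inside a critical section the request
   is only recorded; outside, it takes effect at once.  */
void
toy_change_frame_size (struct toy_frame *f, int width, int height)
{
  if (toy_hold_depth > 0)
    {
      f->pending = 1;
      f->pending_width = width;
      f->pending_height = height;
    }
  else
    {
      f->width = width;
      f->height = height;
    }
}

void
toy_enter_critical_section (void)
{
  toy_hold_depth++;
}

/* Leave the section and apply any change that was put on hold.  */
void
toy_leave_critical_section (struct toy_frame *f)
{
  assert (toy_hold_depth > 0);
  if (--toy_hold_depth == 0 && f->pending)
    {
      f->pending = 0;
      f->width = f->pending_width;
      f->height = f->pending_height;
    }
}
```

Within the section the frame dimensions are therefore stable, which is
the guarantee listed as item (3) above.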